Question 6


Intent of Question 
The primary goals of this investigative task were to assess a student’s ability to (1) identify and conduct an 
appropriate inference based on the differences in the posttest and pretest scores; (2) identify and interpret 
appropriate information from statistical software; (3) make an inference based on separate regression analyses; 
and (4) recognize and explain the additional information provided from the different analyses. 


Solution 
Part (a): 
Component 1: States a correct pair of hypotheses. 


We want to test $H_0: \mu_M = \mu_O$ versus $H_a: \mu_M > \mu_O$, where $\mu_M$ is the mean difference 
(posttest − pretest) for all students at the magnet school and $\mu_O$ is the mean difference (posttest − 
pretest) for all students who applied to attend the magnet school but were not selected and then attended 
the original school. 


Component 2: Identifies a correct test (by name or formula) and checks the conditions. 


A two-sample t-test for means, or $t = \frac{\bar{x}_M - \bar{x}_O}{\sqrt{\frac{s_M^2}{n_M} + \frac{s_O^2}{n_O}}}$.


1. We need to assume randomness of the sampling used. It was stated in the stem that the students from the 
two different schools were randomly selected. 


2. We need to check the assumption that the distributions of differences (posttest — pretest) for each of the 
two schools are normally distributed. Based on histograms and boxplots of these differences, there are no 
outliers or extreme skewness. Because these graphs reveal no obvious departures from normality, it 
appears reasonable to proceed with the t-test. 


Component 3: Performs correct mechanics, which include the value of the test statistic and p-value (or 
rejection region):

$$t = \frac{\bar{x}_M - \bar{x}_O}{\sqrt{\dfrac{s_M^2}{n_M} + \dfrac{s_O^2}{n_O}}} = \frac{\bar{x}_M - \bar{x}_O}{\sqrt{\dfrac{88.58}{8} + \dfrac{15.84}{12}}} = 2.487$$


with a (one-sided) p-value $\approx 0.0177$ and df = 8.69. 
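As a rough check of the mechanics in Component 3, the sketch below recomputes the degrees of freedom and the one-sided p-value using only the Python standard library. It assumes the denominator of the test statistic reads as sample variances 88.58 and 15.84 with sample sizes 8 and 12; the function names are illustrative, and the tail probability is approximated by Simpson's rule rather than taken from statistical software.

```python
import math

def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite degrees of freedom for two independent samples."""
    a, b = var1 / n1, var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

def t_pdf(x, df):
    """Density of Student's t distribution; df may be fractional."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_p(t, df, steps=2000):
    """One-sided P(T > t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Assumed inputs: variances and sample sizes as read from the test statistic.
df = welch_df(88.58, 8, 15.84, 12)
p = upper_tail_p(2.487, df)
print(f"df = {df:.2f}, one-sided p = {p:.4f}")
```

Under these assumed inputs the degrees of freedom round to 8.69, matching the value reported above, and the approximate tail probability should land close to the stated p-value.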


Component 4: Draws an appropriate conclusion in context and with linkage to the p-value (or rejection 
region): 


Using $\alpha = 0.05$, we reject $H_0$ because 0.0177 < 0.05. We conclude that the sample data provide 
convincing evidence that students who attend the magnet school have a higher mean difference in test 
scores than students who attend the original school. 


Part (b): 
Let y = posttest score and x = pretest score. 


(i). The predicted regression equation for the magnet school is $\hat{y} = 73.27 + 0.1811x$. For students at the 
magnet school, a 1-point increase in the pretest score is associated with a predicted increase of 0.1811 points 
on the posttest (i.e., the slope is positive but close to zero). 


(ii). The predicted regression equation for the original school is $\hat{y} = 9.24 + 0.9204x$. For students at the 
original school, a 1-point increase in the pretest score is associated with a predicted increase of 0.9204 points 
on the posttest (i.e., the slope is positive and close to 1). 


Part (c): 


(i). The test statistic is t = 0.40 with a p-value of 0.706. Because the p-value is greater than any reasonable 
significance level, say 0.05, we fail to reject $H_0$. We conclude that there is insufficient evidence to state that 
pretest score is a significant predictor of posttest score at the magnet school. The data do not support a 
conclusion that a correlation exists between pretest and posttest scores at the magnet school. 


(ii). The test statistic is t = 6.09 with a p-value of 0.000. Because the p-value is less than any reasonable 
significance level, say 0.05, we reject $H_0$ and conclude that there is sufficient evidence to state that pretest 
score is a significant predictor of posttest score at the original school. The data support a conclusion that a 
correlation exists between pretest and posttest scores at the original school. 


Part (d): 


Unlike the two-sample analysis of differences in part (a), the regression analyses allow us to explore the 
relationship between pretest and posttest scores at each school. From the regression output and graph, we see 
that students with low pretest scores benefit more from attending magnet schools, as compared with students 
with low pretest scores at the original school. Also at the magnet school, students with low pretest scores 
benefit more than students with high pretest scores. In other words, students at the magnet school all score 
high on the posttest, regardless of how they scored on the pretest. But at the original school, only students 
who scored high on the pretest scored high on the posttest. 
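
To make the contrast in part (d) concrete, the following sketch evaluates the two fitted lines from part (b) at a few pretest scores. The pretest value 65 is the one cited in the commentary on sample 6A; the other values (75, 85, 95) and the function names are hypothetical choices for illustration, and the predictions are model outputs only, not observed scores.

```python
def predict_magnet(pretest):
    """Predicted posttest score from the magnet school line in part (b)(i)."""
    return 73.27 + 0.1811 * pretest

def predict_original(pretest):
    """Predicted posttest score from the original school line in part (b)(ii)."""
    return 9.24 + 0.9204 * pretest

# Hypothetical pretest scores chosen only to illustrate the pattern.
for pretest in (65, 75, 85, 95):
    m = predict_magnet(pretest)
    o = predict_original(pretest)
    print(f"pretest {pretest}: magnet {m:.1f}, original {o:.1f}, gap {m - o:+.1f}")
```

At a pretest score of 65 the predictions round to 85 for the magnet school and 69 for the original school, and the gap between the two lines narrows as the pretest score rises, which is the pattern part (d) describes.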

Scoring 
Parts (a), (b), (c), and (d) are scored as essentially correct (E), partially correct (P), or incorrect (I). 
Part (a) is scored as follows: 

Essentially correct (E) if all four components are correct. 

Partially correct (P) if two or three components are correct. 

Incorrect (I) if at most one component is correct. 
Part (b) is scored as follows: 


Essentially correct (E) if all four components—both equations and both interpretations in (i) and (ii)—are 
correct. 


Partially correct (P) if two or three components are correct. 
Incorrect (I) if at most one component is correct. 

Part (c) is scored as follows: 
Essentially correct (E) if all four components (both p-values and both conclusions in (i) and (ii)) are correct. 
Partially correct (P) if two or three components are correct. 
Incorrect (I) if at most one component is correct. 

Part (d) is scored as follows: 
Essentially correct (E) if the response clearly explains how the regression analyses provide additional 
information in this context by addressing the impact of the magnet school on students with low pretest 
scores. 
Partially correct (P) if the response clearly describes how the regression analyses provide additional 
information in context but does not explain the impact of the magnet school on students with low pretest 
scores. 

Incorrect (I) if the response does not meet the criteria for an E or P. 


Each essentially correct (E) response counts as 1 point. Each partially correct (P) response counts as ½ point. 


4 Complete Response 
3 Substantial Response 
2 Developing Response 
1 Minimal Response 


If a response is between two scores (for example, 2½ points), use a holistic approach to determine whether to 
score up or down, depending on the overall strength of the response and communication. 
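
The point arithmetic above can be sketched as follows. This is only an illustration of the tally: the names `POINTS`, `LABELS`, `total_points`, and `score_label` are hypothetical, and the holistic judgment for in-between totals is deliberately left to the reader rather than automated.

```python
# Points per part rating, as stated in the scoring rule.
POINTS = {"E": 1.0, "P": 0.5, "I": 0.0}

# Score labels listed in the guidelines (no label is given there for 0).
LABELS = {
    4: "Complete Response",
    3: "Substantial Response",
    2: "Developing Response",
    1: "Minimal Response",
}

def total_points(ratings):
    """Sum the points for the ratings of parts (a) through (d)."""
    return sum(POINTS[r] for r in ratings)

def score_label(ratings):
    """Return the label for a whole-number total, or None when the total
    falls between two scores and a holistic decision is required."""
    total = total_points(ratings)
    if total == int(total):
        return LABELS.get(int(total))
    return None

print(total_points(["P", "E", "E", "P"]), score_label(["P", "E", "E", "P"]))
print(total_points(["P", "P", "E", "I"]), score_label(["P", "P", "E", "I"]))
```

The two example calls use the part ratings reported in the commentary for samples 6B and 6C, which total 3 and 2 points respectively.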
(a) Perform a test to determine whether students who attend the magnet school demonstrate a significantly 
higher mean difference in test scores (Posttest − Pretest) than students who applied to attend the magnet 
school but who were not selected and then attended their original school. 
(b) (i) State the equation of the regression line for the magnet school and interpret its slope in the context of 
the question. 


$\hat{y} = .1811x + 73.27$


slope = .1811
(ii) State the equation of the regression line for the original school and interpret its slope in the context of 
the question. 
(c) To determine whether there is a significant correlation between pretest score and posttest score, a test of the 
following hypotheses will be performed. 


$H_0$: There is no correlation between pretest score and posttest score (true slope = 0) 
versus 
$H_a$: There is a correlation between pretest score and posttest score (true slope $\neq$ 0) 


(i) Using the regression output, state the p-value and conclusion for this test at the magnet school. 
Assume the conditions for inference have been met. 


p-value: .706
(ii) Using the regression output, state the p-value and conclusion for this test at the original school. 
Assume the conditions for inference have been met. 
(d) What additional information do the regression analyses give you about student performance on the science 
test at the two schools beyond the comparison of mean differences in part (a)? 
(a) Perform a test to determine whether students who attend the magnet school demonstrate a significantly 
higher mean difference in test scores (Posttest − Pretest) than students who applied to attend the magnet 
school but who were not selected and then attended their original school. 
(b) (i) State the equation of the regression line for the magnet school and interpret its slope in the context of 
the question. 


$\hat{y} = 73.27 + .1811x$


As pretest score increases by 1 point, posttest score 
is expected to increase by .1811 points. 


(ii) State the equation of the regression line for the original school and interpret its slope in the context of 
the question. 


$\hat{y} = 9.24 + .9204x$

... score is expected to increase by .9204 points. 


(c) To determine whether there is a significant correlation between pretest score and posttest score, a test of the 
following hypotheses will be performed. 


$H_0$: There is no correlation between pretest score and posttest score (true slope = 0) 
versus 
$H_a$: There is a correlation between pretest score and posttest score (true slope $\neq$ 0) 


(i) Using the regression output, state the p-value and conclusion for this test at the magnet school. 
Assume the conditions for inference have been met. 


The p-value is .706, which is greater than .05, so 
(ii) Using the regression output, state the p-value and conclusion for this test at the original school. 
Assume the conditions for inference have been met. 
(d) What additional information do the regression analyses give you about student performance on the science 
test at the two schools beyond the comparison of mean differences in part (a)? 
(a) Perform a test to determine whether students who attend the magnet school demonstrate a significantly 
higher mean difference in test scores (Posttest − Pretest) than students who applied to attend the magnet 
school but who were not selected and then attended their original school. 
(b) (i) State the equation of the regression line for the magnet school and interpret its slope in the context of 
the question. 


$\hat{y} = 73.27 + 0.1811x$
Slope $\beta$ = 0.1811


For every increase in x (1 point in the pretest 
score), the posttest score increases by the 
slope, 0.1811 points 


(ii) State the equation of the regression line for the original school and interpret its slope in the context of 
the question. 


$\hat{y} = 9.24 + 0.9204x$
Slope $\beta$ = 0.9204 
(c) To determine whether there is a significant correlation between pretest score and posttest score, a test of the 
following hypotheses will be performed. 


$H_0$: There is no correlation between pretest score and posttest score (true slope = 0) 
versus 
$H_a$: There is a correlation between pretest score and posttest score (true slope $\neq$ 0) 


(i) Using the regression output, state the p-value and conclusion for this test at the magnet school. 
Assume the conditions for inference have been met. 


$H_0$: $\beta = 0$    p = 0.7063 

$H_a$: $\beta \neq 0$    Since p > $\alpha$ = 5%, the null hypo cannot be rejected. 
Therefore, there is ... that suggests 
... posttest scores at the magnet school 


(ii) Using the regression output, state the p-value and conclusion for this test at the original school. 
Assume the conditions for inference have been met. 
(d) What additional information do the regression analyses give you about student performance on the science 
test at the two schools beyond the comparison of mean differences in part (a)? 
Question 6 
Overview 


The primary goals of this investigative task were to assess a student’s ability to (1) identify and conduct an 
appropriate inference based on the differences in the posttest and pretest scores; (2) identify and interpret 
appropriate information from statistical software; (3) make an inference based on separate regression analyses; 
and (4) recognize and explain the additional information provided from the different analyses. 


Sample: 6A 
Score: 4 


This response is correct, complete, and well expressed in all parts. The hypotheses in part (a) are stated correctly, 
with the parameter symbols defined clearly as the means of the differences. The two-sample t-test is identified by 
name at the top of the page. The random sampling condition is mentioned, and the normality condition is checked by 
examining boxplots. The response indicates that the sample sizes are small enough to require normality of the 
differences in order to apply the two-sample t-test. The mechanics of calculating the test statistic and p-value are 
correct; notice that the formula for calculating the test statistic is not necessary. The conclusion is very well 
expressed in context, being carefully worded in terms of “the mean difference in test scores.” 


The equations in (b) are reported correctly, using good $\hat{y}$ notation to denote a predicted value. The variables are not 
clearly defined, but the response makes clear which variable (pretest score) is represented by x and which (posttest 
score) is represented by y. The interpretations of slope are both good. The interpretation in (i) uses the phrase “on 
average” to convey randomness/variability, while the interpretation in (ii) uses the phrase “expect to.” 


The p-values in part (c) are reported correctly, and the conclusions are presented clearly. The response is not required 
to show linkage between the p-value and conclusion, because the question simply asks students to “state the p-value 
and conclusion” rather than to conduct a full hypothesis test. 


The response to part (d) indicates what the regression analyses reveal about student performance on the science test 
at these two schools. The first sentence captures the essential point by observing that “a magnet school student that 
scores poorly on the pretest is likely to score much higher on the post test.” The last three sentences of the response 
to this part give a specific example, pointing out that a student with a pretest score of 65 is expected to score an 85 on 
the posttest in the magnet school and only a 69 in the original school. The use of the words “likely” and “expected 
to” in these sentences indicates a recognition and understanding of regression lines as models. 


This answer was judged complete in all its parts and especially impressive for its clear communication and concise 
description of the fundamental point in part (d). It merited 4 points. 


© 2008 The College Board. All rights reserved. 
Visit the College Board on the Web: www.collegeboard.com. 


AP® STATISTICS 
2008 SCORING COMMENTARY 




Sample: 6B 
Score: 3 


The response in part (a) contains vague hypotheses. The two groups should be defined as the magnet school and 
original school, respectively, and ideally the “$\mu_1$” and “$\mu_2$” symbols would be defined. The response makes no 
effort to identify the appropriate test (by name or formula), and no conditions are stated or checked. The test 
statistic and p-value are calculated correctly, and the conclusion is correct in context with linkage to the p-value. 
Part (a) was scored as partially correct. 


The response for part (b) is quite good. The equation of the regression line is provided, carefully expressed in 
terms of $\hat{y}$ rather than simply y. The variables y and x should be defined, but the slope interpretation demonstrates 
an understanding of what each symbol represents. The phrase “expected to” is a good way to indicate 
randomness/variability in the slope interpretation. Part (b) was scored as essentially correct. 


The response to part (c) correctly identifies the two p-values and states the appropriate conclusion in context. The 
p-values are compared to a significance level of $\alpha = 0.05$. Part (c) was scored as essentially correct. 


The response to part (d) indicates that the magnet and original schools have different impacts on students, 
depending on their pretest scores. The student describes this in dramatic terms for the magnet school, writing that 
“some saw their scores skyrocket, others found them remain relatively constant.” What is lacking in this response 
is an indication that the students with low pretest scores are the ones whose scores improve dramatically. Owing 
to this omission, part (d) was scored as partially correct. 


This response contained two parts that were deemed essentially correct and two parts that were assessed as 
partially correct. Overall, the answer was considered substantial but not complete; it earned 3 points. 


Sample: 6C 
Score: 2 


The hypotheses in part (a) are presented correctly in symbols. The accompanying sentence is correct for the 
alternative hypothesis but not for the null hypothesis. The two-sample t-test is identified correctly, and the 
condition of “simple random samples” is mentioned. The normality condition is mentioned but not checked, so 
the component for identifying the test and checking conditions was not considered to be met. The mechanics are 
correct, as is the conclusion. Part (a) was scored as partially correct. 


The equations reported in part (b) are correct, but the variable names are not provided. The slope interpretations 
make clear which variable is the pretest score and which is the posttest score. However, the interpretations make no 
mention of randomness/variability, making the relationships sound deterministic. Part (b) was scored as partially 
correct. 


The p-values are reported correctly in part (c). In fact, it seems that the p-values were determined by using a calculator, 
because they include more significant figures than the output provided. The conclusions are correct and in 
context. The hypotheses are provided, as is linkage between the p-value and conclusion, although neither of those is 
required here. The conclusion in (i) would be stronger if it were not essentially accepting $H_0$ by saying that “there is 
no correlation between pretest scores and posttest scores at the magnet school.” Despite this concern, part (c) was 
scored as essentially correct. 


The response in part (d) contradicts the correct conclusion from part (a) by stating that “students at the original 
school actually improved much more than those at the model school.” The student appears to be confusing the 
steepness of the regression lines (for predicting posttest from pretest) with the amount of improvement 
(posttest — pretest). This response was scored as incorrect. 


With one essentially correct part and two partially correct parts, this answer was considered a developing response 
and received 2 points. 